Planning method and system for use in cognitive programs

ABSTRACT

A system for achieving a desired goal in a domain. The system may comprise a device operable to receive information and simulate the domain therefrom; a device operable to simulate one or more effects due to one or more operators; a device operable to specify a number of items and/or classes of items and whether each item and/or each class of items is an affectable obstacle or a non-affectable obstacle; a device operable to automatically generate a candidate plan to achieve the desired goal by utilizing the simulated domain and the simulated effect(s), wherein the candidate plan could involve one or more affectable obstacles but does not involve any non-affectable obstacles; and a device operable to automatically refine the candidate plan to change at least one of the affectable obstacles involved in the candidate plan.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. application Ser. No. 12/655,256 filed on Dec. 24, 2009, which is a continuation of U.S. application Ser. No. 11/403,487 filed on Apr. 13, 2006, now U.S. Pat. No. 7,640,221, which claims the benefit of the filing date of U.S. Provisional Patent Application No. 60/671,660 filed Apr. 15, 2005, the disclosures of which are hereby incorporated herein by reference.

BACKGROUND OF THE INVENTION

The present invention relates to a method and system for utilizing computers to discover plans that achieve design goals or sets of design goals in domains.

Applications for such systems may include programs that schedule work in factories or plants, that control complex systems such as factories or power plants to maintain design criteria, that plan routing for fleets of trucks, that find plans for traveling economically and rapidly to a desired destination, or that play games and solve puzzles. Such applications typically involve a particular given domain. For example, if one were designing a program to play chess, the domain would be chess, or if one were controlling a factory, the domain would be that particular factory, including the particular layout of the particular machines therein. Aspects of the domain may be represented in a computer program or in data readable by a computer program, so that a computer program can be used to construct plans or to aid humans to construct plans to achieve a variety of goals within the domain. For example, one might want the program to be able to win from a variety of different chess positions, or one might want the program to control a factory to optimally produce a variety of different products, with the particular design goal (say whether to produce 100,000 pencils or 50,000 pens and 10,000 pencils) changing in different runs of the program.

Present day automated systems for finding plans may utilize either search or planners. Both of these methods are typically based on domain-independent algorithms, and exploit little of the structure of a particular domain. Search programs typically only have knowledge about their goal in the form of an operator that recognizes when it has been achieved, and a heuristic function that may indicate distance from the goal. Planners typically exploit knowledge encoded in what is called the STRIPS language, but this also may fail to exploit useful structure of particular domains. As a result, these methods often search through many possible sequences of operators that are not relevant to solving specific obstacles, and thus are often not efficient enough to solve problems that one wishes to solve in a reasonable amount of time.

However, many domains of interest have structure which might be efficiently exploited for solution. For example, many problems of interest involve routing in two or three dimensions, so that aspects of the topology of two and three dimensions might be exploited in finding plans. Many domains of interest contain objects that may be affected in ways characteristic of the type of object, and which may sometimes present obstacles to plans. Many domains of interest contain characteristic kinds of obstacles to implementing available actions, and these obstacles may in many cases be fixed by solutions characteristic of the kind of obstacle.

SUMMARY OF THE INVENTION

In accordance with an aspect of the present invention, a system for achieving a desired goal in a domain, in which the domain has one or more operators associated therewith, is provided. The system may comprise a device operable to receive information pertaining to the domain and simulate the domain therefrom; a device operable to simulate one or more effects due to the one or more operators associated with the domain; a device operable to specify a number of items and/or a number of classes of items in the domain and whether each item and/or each class of items is an affectable obstacle wherein at least one of the one or more operators can cause a change thereto or a non-affectable obstacle wherein the one or more operators cannot cause a change thereto; a device operable to automatically generate a candidate plan to achieve the desired goal by utilizing the simulated domain and the simulated effect(s), wherein the candidate plan could involve one or more affectable obstacles but does not involve any non-affectable obstacles; and a device operable to automatically refine the candidate plan to change at least one of the affectable obstacles involved in the candidate plan.

The present invention may be accomplished by a structured program (or a device that would function similarly to the operation of such a program). The program may comprise (or call as executable modules) a number of domain specific elements including one or more domain simulators, one or more operators representing actions that may be taken, a number of methods for recognizing a domain state wherein an operator is immediately applicable, a number of methods for recognizing a domain state wherein an operator might be applied if one or more affectable obstacles were fixed, and a number of methods of fixing obstacles that obstruct operators. The program may first use a provided means to find a candidate plan or plans, and then may refine such plan or plans. A candidate plan may comprise a sequence of subgoals, typically of applying operators, wherein the program proposes that if those operators may be applied in that sequence, a goal state will result. One means of finding candidate plans is to employ a search technique wherein enabled operators are applied at previously reached states in simulation until a goal state is reached, wherein an operator is enabled in a state if a method recognizes either that it is applicable in that state, or that it might be applicable in that state if affectable obstacles were fixed, changed, or overcome. Means of refining candidate plans may comprise an ordering system to process candidate plans iteratively in an optimized order (typically best first, where best may be measured by an accounting procedure that estimates the cost of the plan) and a means of resolving obstacles. Plans are typically processed in sequential order so that as the next unresolved element is being processed, the state of the domain is available in simulation (because all earlier operations have been resolved) so that relevant objects and operations may be recognized within the simulation. A means of resolving obstacles may be embodied in the following procedure.
If a specific method is supplied for dealing with an obstacle, said method may be invoked. Otherwise a default method may be applied which may comprise a search in which operators are applied that affect the obstacle. If it is proposed to apply an operator to affect an obstacle and there are obstacles to applying that operator, the same approach may first be applied to solve those obstacles.

The present invention may provide a technique of achieving design goals in domains, which may be tailored to the structure of the domain, and thus may be much faster and able to address more complex problems as compared to current techniques.

The present invention may provide a technique of exploiting the particular structure of a domain or domains so that only a relevant operator or operators will be considered in searches.

The present invention may provide a technique that may exploit the structure of a planning problem or problems in two or three dimensions.

This process may be highly efficient because (a) it may invoke methods specific to the domain for solving problems, (b) it may find candidate plans efficiently by ignoring obstacles that may later be corrected, and (c) it may only consider relevant operators, that is, operators may only be considered in the refinement process if (i) they refine a candidate plan that has been proposed by the mechanism, thus implying reason to believe it may succeed, and (ii) they are relevant to fixing an obstacle to that candidate plan.

Other improvements are also described, including a means for utilizing a module that detects localized configurations that prevent achievement of a goal state in order to refine and rule out candidate plans; a means for recognizing when previously affected objects are obstacles to a candidate plan and automatically suggesting an alternative candidate plan that may avoid this problem; and a means for finding and using constraints on the order in which goals are solved, for circumstances where multiple goals are presented.

DESCRIPTION OF THE FIGURES

A more complete appreciation of the subject matter of the present invention and the various advantages thereof can be realized by reference to the following detailed description in which reference is made to the accompanying drawings wherein like reference numbers or characters refer to similar elements.

FIG. 1 is a diagram of a flow chart of a planner in accordance with an embodiment of the present invention;

FIG. 2 is a diagram of a flow chart of a method of finding candidate plans;

FIG. 3 is a diagram of a flow chart of a method for resolving unresolved subgoals in a candidate plan;

FIG. 4 is a diagram of a flow chart of a default method for processing a subgoal when it specifies an action;

FIG. 5 is a diagram of a flow chart of a routine for clearing obstacles or deadlocks;

FIG. 6 is a diagram of a flow chart of a mark handling routine;

FIG. 7 is a diagram of a flow chart to which reference will be made in explaining how a present method(s) or module(s) embodying such method(s) may be used within a module constructor(s);

FIG. 8 is a diagram of a flow chart of a make action routine;

FIG. 9 is a diagram of a flow chart of a deadlock detector;

FIG. 10 is a diagram of a domain;

FIG. 11 is a diagram of a flow chart to which reference will be made in explaining a module constructor;

FIG. 12 is a diagram of a flow chart to which reference will be made in explaining an algorithm to find constraints on the order in which goals are solved when there are multiple goals; and

FIG. 13 is a diagram of a system in accordance with an embodiment of the present invention.

DETAILED EMBODIMENTS

The present invention may be embodied in a computer program for use with a computer or, alternatively, may be embodied in a device that would function equivalently to the execution of such a computer program. In either situation, the present invention may, given a description of a goal or goals to be achieved in a domain, automatically output a sequence of actions to be taken to achieve the goal, or report failure. Such program or device may alternatively be referred to herein as “the program” or “the planner”. The program may be provided with (typically as a callable module) one or more simulations of the domain. Simulations may act at various levels of detail, or simulate various aspects of the domain, potentially ignoring other aspects. The domain may include various kinds of objects and various kinds of goals, and various kinds of actions may be available within the domain. The program may be provided with code that recognizes which actions are available at a given domain state, and that simulates effects of these actions on the domain, such as effects on objects in the domain. Actions made available to the planner may include macro-actions, that is, executable code that implements a number of calculations and actions, which may be referred to herein as “operators”. Operators may be associated with executable code that recognizes when the operator may be applied in the domain simulation and implements the effects of the associated action in simulation. The program may also be provided with executable code that may recognize objects or classes of objects within the domain that may under some circumstances be affectable, changeable, or fixable by the available actions, and that may recognize objects or classes of objects that are not affectable, changeable, or fixable.
Executable code (within or called by the program) may further identify an operator or operators as potentially available from a given domain state if an affectable obstacle or obstacles (such as affectable objects) blocking the action were removed. Operators may thus also be associated with executable code that recognizes or specifies when the operator could be applied if certain affectable obstacles blocking the action were removed or appropriately dealt with, and executable code that calculates effects that would then result in simulation, and executable code that calculates means for correcting the obstacle. Operators may optionally be further associated with executable code that specifies a method for removing, correcting, solving, or overcoming obstacles to applying the operator. Affectable objects may be associated with executable code that specifies methods for affecting the object, for example by invoking appropriate operators. Goal types or classes may be associated with executable code that embodies methods useful for achieving such a goal. Typically, objects and operators and goals and domains may come in classes that share certain methods implemented in executable code, and any given specific object or action or goal or domain may be an instance of such a class.

In computer science, for example in the Python programming language, the term class may describe a software object that typically is used to represent or model different items of a similar kind and the term method may describe executable code that may be applied to instances of a class.

Many real world domains contain actual structure, such as particular objects and kinds of objects, that behave in characteristic ways. The present invention provides a technique for exploiting the structure of particular domains so as to efficiently solve design problems within them. In order to facilitate this, as described above the user may supply software objects that model, mirror, reflect, or exploit certain aspects of the actual structure of the domain, and that may calculate effects in simulation. Such software objects may be viewed as defining properties of the domain simulation. An object may thus be defined within the program as a particular piece of supplied executable code that may model a physical object or feature or class of objects in the domain. An operator may thus be defined within the program as a particular piece of supplied executable code that may model an action or macro-action that might be available in the domain. An object may be considered affectable if executable code for identifying affectable objects identifies it as such, or alternatively if any available operators affect it or its simulation, which may be the case when actual actions available in the domain may change in some way the modeled physical object or feature. An object may be considered an obstacle to an operator in a state if executable software identifies it as such, which may occur when a real domain object or feature modeled prevents or obstructs applying the real actions modeled in the modeled domain state. An operator may be considered to be applicable if obstacles were removed when executable software for identifying that condition so identifies it, and said software may specify a simulated domain state or changes to a simulated domain state that would result.

One way of thinking about this is using a metaphor in which the domain is represented by a simulation, and code is supplied to model aspects of the world, including counterfactual aspects (such as modeling the effects of actions that would only be applicable if certain obstacles were removed). Exploiting this metaphor, actions and items such as objects within the program may sometimes be described as if the action were occurring in the domain.

For example, if the domain is navigating around an office building, as shown in FIG. 10, affectable objects may include windows (such as windows 10) and doors that can be opened or closed (such as doors 3, 6, 7, and 8), or furniture (such as chairs 1, 2, 4, and 5) that may be moved, and unaffectable objects may include walls (assuming the conditions of the domain specify that no actions are available that tear down or rebuild walls). An action that might be available from a given domain state is to walk south one meter, which might be currently blocked by an obstacle comprising a closed door but said action might be available if the door were first opened. Such an action might have the effect on a simulation of translating a simulated man one meter south within the simulated office. An obstacle to opening the door might be that it is locked. The program may thus include executable code for simulating walking around the office (including operators such as walk south one meter, walk east one meter), and executable code for recognizing when walking south one meter would be feasible if an obstacle were removed and for simulating the location that would be reached. Classes of obstacles might include doors, and executable code might be provided for correcting a locked door, which could invoke a module for searching for a key in proximate desk drawers and using it to unlock the door. The operator for walking might be able to access an executable method for opening doors, or an executable method for searching for keys and unlocking doors and then opening them. A particular locked door might be an instance of the class of locked doors, in which case the same method of searching for a key and using it to open the door might be available for all such doors.
The planner may be able to solve problems of the form: given a domain simulation representing a particular office building and any desired location within the office building, automatically find a way of getting to that location or report that it is inaccessible.
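The class and method vocabulary used in the office example can be illustrated with a minimal Python sketch. The class names (`Obstacle`, `Wall`, `Door`), the `clearing_steps` method, and the step names are hypothetical and chosen only for illustration; they are not taken from the specification.

```python
class Obstacle:
    """Base class for items in the simulated domain (hypothetical name)."""
    affectable = False  # by default no operator can change the item

    def clearing_steps(self):
        raise NotImplementedError("non-affectable obstacles cannot be cleared")


class Wall(Obstacle):
    """Non-affectable: no available action tears down or rebuilds walls."""


class Door(Obstacle):
    """Affectable: a door can be unlocked and opened."""
    affectable = True

    def __init__(self, is_open=False, locked=False):
        self.is_open = is_open
        self.locked = locked

    def clearing_steps(self):
        # Method shared by every instance of the class: a locked door is
        # handled by finding a key and unlocking it before it is opened.
        steps = []
        if self.locked:
            steps += ["search_desk_drawers_for_key", "unlock_door"]
        if not self.is_open:
            steps.append("open_door")
        return steps


locked_door = Door(locked=True)
closed_door = Door()
```

Here every `Door` instance shares one clearing method, mirroring the statement that a method supplied for a class of obstacles may be available to each instance of that class.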

The planner may automatically construct plans in two steps, as shown in FIG. 1. First, after inputs such as those pertaining to the domain, operator(s), item(s) and/or classes thereof, an initial state of the domain, a desired goal(s) and so forth (as hereinbelow more fully described) have been provided by a user(s) or from a memory, an initial list of candidate plans may be automatically constructed (STEP S110) without any additional input(s) from the user(s). Each of these plans may consist of a sequence of subgoals to be achieved, such as subgoals of applying a particular action at that point in the plan. The planner proposes that if this sequence of subgoals can be achieved, it may result in satisfying the design goals. Second (STEP S120 and below), the planner may automatically refine these candidate plans without any further input(s) from the user(s), seeking to generate a concrete or final plan that has been verified in simulation as resulting in a configuration satisfying the design goals.

“Candidate plans” will sometimes for convenience be referred to simply as plans. Candidate plans may contain subgoals of performing operators, wherein the operators may be obstructed inasmuch as the program may identify that they could not be performed at the simulated domain state, which may model the fact that the associated actions would be impossible at the corresponding state of the actual domain, but wherein the program may identify or specify that the actions would be applicable if certain objects were removed or appropriately dealt with. Such objects may be considered obstacles, and the candidate plan may be said to involve them, and it may be considered that the obstacles are changed, fixed, overcome or solved when a refined candidate plan is generated, typically including additional operators inserted for the purpose of dealing with the obstacles, so that the operator may be applied and/or the subgoal realized. If it is recognized that a subgoal cannot possibly be resolved unless a particular obstacle is overcome, then it may be considered that the obstacle should be overcome. The automated process of considering candidate plans and modifying them or creating new candidate plans based on them in ways directed toward changing, fixing, overcoming or solving the obstacles may be considered refining the candidate plan.

An embodiment of the present invention for automatically constructing the initial list of candidate plans is shown in FIG. 2. The set of states reached may be initially set to include only the initial state (STEP S210). (FIG. 2 assumes that the initial state does not satisfy the goals; a check could be provided that returns success if it does.)

Then the following process may be iterated. If a time limit has been placed on the computation, it may be checked to see if it has been exceeded (STEP S220). If so, the process may be terminated. If not, whether there are enabled operators that have not yet been applied at states that have been reached may be determined at STEP S230. If not, the process terminates. If so, processing may proceed to STEP S240 wherein the process applies one or more such operators, keeping track of the states reached and the sequences of operators that reached those states from the initial state. Next, whether the goal conditions are satisfied in any new states reached may be determined at STEP S250. If so, processing may proceed to STEP S260 wherein the sequence of operators that reached this state may be added to a list. When the process terminates, this list may be returned as the list of candidate plans. If it is determined that there are enough candidate plans at STEP S270, the program may terminate, otherwise it may loop back to examine the effects of enabled operators at reached states that have not yet been applied.

Action operators may be deemed enabled if they are among a set of action operators supplied to the system as relevant to achieving the goals, and if they can either be directly applied in the position, or if it is recognized, specified, or identified that they could possibly be applied in the state if affectable obstacles were removed or corrected.

This recognition may be made by a method associated with the operator class that recognizes when the operator is potentially applicable and implements in simulation effects that the operator would have if it were applicable and applied.

Note that the action operators considered enabled may differ from the actions that would ordinarily be considered applicable. Action operators may often have prerequisites before they can be used, for example you can't walk through a door unless it is open, and ordinary applications of dynamic programming may typically apply only actions that are actually applicable. Included in this step may be some actions that may not be immediately applicable. By including them, shorter sequences of proposed actions may be found that result in achieving the goal criteria, making the search much more efficient. However, the sequence of proposed actions may not actually be feasible, thus requiring the second step of the planner: automatically refining the proposed plans to see whether the obstacles can in fact be removed, overcome or solved and to propose plans for doing so.

The choice of which enabled operators to apply in step S240 may be made in different ways. One way is to apply all enabled operators, reaching as many new states as possible from the states newly reached at the previous iteration. Alternatively, action operators applied in step S240 may be suggested by a domain-specific module as appropriate to the domain and situation, or may be selected at random among enabled operators, or in other ways from among enabled operators.

As an example, consider the office building shown in FIG. 10 and assume that someone wishes to reach location B starting from location A, with actions stepping north, south, east, or west. Starting from location A, there are four (4) enabled operators, that is, movement in a north direction, movement in an east direction, movement in a south direction, and movement in a west direction (even though walking west hits chair (1), since chairs are regarded as affectable objects). Continuing from locations reached, one may apply operators that do not hit walls, and have not already been applied from that location. Eventually one might find one or more candidate paths to location B, even if these paths go through locked doors. The process may return a list of such paths, each path consisting of a sequence of operators. In FIG. 10, the process may return a path going through open door (7) and the locked door (8), as well as a path going through chair (2), the closed door (3), chair (5), and open door (6).
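The candidate-plan search of FIG. 2 can be sketched in Python for a toy corridor. This is a minimal sketch under stated assumptions, not the embodiment itself: the grid encoding (`#` wall, `D` closed door, `C` chair, `.` floor), the function names, and the choice of breadth-first order are all hypothetical, and the layout is not that of FIG. 10.

```python
from collections import deque

# Hypothetical layout: '#' wall (non-affectable); 'D' closed door and
# 'C' chair (affectable obstacles); '.' floor; 'A' start; 'B' goal.
GRID = [
    "#######",
    "#A.D.B#",
    "#######",
]
MOVES = {"north": (-1, 0), "south": (1, 0), "east": (0, 1), "west": (0, -1)}
AFFECTABLE = {"D", "C"}


def find(grid, ch):
    for r, row in enumerate(grid):
        c = row.find(ch)
        if c >= 0:
            return (r, c)
    raise ValueError(ch)


def enabled(grid, pos, move):
    """Return the cell reached if the operator is enabled, else None.
    A move into an affectable obstacle is enabled; a move into a wall is not."""
    r, c = pos[0] + MOVES[move][0], pos[1] + MOVES[move][1]
    if not (0 <= r < len(grid) and 0 <= c < len(grid[r])):
        return None
    return None if grid[r][c] == "#" else (r, c)


def candidate_plans(grid, max_plans=1):
    start, goal = find(grid, "A"), find(grid, "B")
    reached = {start}                      # states reached so far
    frontier = deque([(start, [])])        # (state, operators that led to it)
    plans = []
    while frontier and len(plans) < max_plans:
        pos, ops = frontier.popleft()
        for move in MOVES:                 # apply every enabled operator
            nxt = enabled(grid, pos, move)
            if nxt is None or nxt in reached:
                continue
            if nxt == goal:                # goal satisfied: record the plan
                plans.append(ops + [move])
                continue
            reached.add(nxt)
            frontier.append((nxt, ops + [move]))
    return plans


def involved_obstacles(grid, plan):
    """List the affectable obstacles a candidate plan passes through."""
    pos, found = find(grid, "A"), []
    for move in plan:
        pos = enabled(grid, pos, move)
        if grid[pos[0]][pos[1]] in AFFECTABLE:
            found.append(pos)
    return found


plans = candidate_plans(GRID)
```

In this sketch the returned plan walks straight through the closed door; `involved_obstacles` reports the door's cell, which is the obstacle a later refinement step would have to clear.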

Alternatively, this planner might be further supplied with macro operators that take it to the adjoining room.

Very short plans may be found in this case, and may again be refined as described above when obstacles to the macro action (e.g. the locked door) are dealt with.

The actual “state” defined or utilized with the present invention (such as in the above discussion and in FIG. 2) may be chosen appropriately to the particular domain and goals. In problems such as the office problem, where the goal is to find a path from one location to another location, the state will typically be characterized by a location reached. In problems where the goal is to find a sequence of actions moving a collection of objects to a collection of goal locations (such as a planner to route a fleet of trucks) the state may be characterized by a set of locations (the locations of all trucks). In more general classes of domains, the state may be characterized in other ways.

In some embodiments of the planner, the process shown in FIG. 2 may be applied in a “lazy” fashion. That is, rather than finding all possible candidate plans, the process automatically terminates at step S270 once a number of candidate plans has been found. At this termination step the state of the whole process may be stored, so that if it is later decided that additional candidate plans should be examined (for example, because more detailed analysis of the ones previously found shows them to be less promising) a number of additional candidate plans can be generated by simply resuming the process.
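One way to obtain this store-and-resume behavior in Python is a generator, whose local state is kept between requests. The sketch below is an illustration only: the function `lazy_candidate_plans`, the integer toy domain, and the operator names `inc` and `double` are hypothetical.

```python
from collections import deque
from itertools import islice


def lazy_candidate_plans(start, is_goal, successors):
    """Yield candidate plans one at a time; the search state (reached set
    and frontier) persists between yields, so the search can be resumed."""
    reached = {start}
    frontier = deque([(start, ())])
    while frontier:
        state, ops = frontier.popleft()
        for op, nxt in successors(state):
            if is_goal(nxt):
                yield ops + (op,)
            elif nxt not in reached:
                reached.add(nxt)
                frontier.append((nxt, ops + (op,)))


# Hypothetical toy domain: states are integers and the goal is to reach 4.
def successors(n):
    return [("inc", n + 1), ("double", n * 2)] if n < 4 else []


plans = lazy_candidate_plans(1, lambda n: n == 4, successors)
first_batch = list(islice(plans, 1))   # stop after one candidate plan
second_batch = list(islice(plans, 5))  # later: resume where the search paused
```

The second call to `islice` continues from the stored frontier rather than restarting, which corresponds to generating additional candidate plans only when the earlier ones prove less promising.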

After candidate plans have been generated, the candidate plans may be automatically scored according to a predetermined criterion such as the estimated cost it will take to achieve them. The estimated cost may be a provided measure that allows one to sort the plans and examine the best first, according to a measure that is appropriate to the domain and leads to an effective search. In a typical situation, the cost of a plan may be set to be the sum of a cost attributed to the actions in the plan, plus an estimate (preferably a lower bound) on the cost of actions to remove obstacles in the way of the plan. It may be preferred to use a cost measure that increases as the number of obstacles remaining increases, since this will preferentially search short plans. There are many more conceivable long sequences of actions than short ones.

If a cost is employed that measures the length of the plan, sequences of actions longer than the shortest successful sequence may be avoided entirely. The cost may also reflect an estimate of the difficulty of dealing with obstacles, or other measures of the desirability and likelihood of success of the plan. Such estimates may be provided by methods associated with the class of a particular obstacle, or with the class of a particular operator blocked by the obstacle.
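The typical cost measure described above, action costs plus a lower bound on obstacle-removal costs, can be sketched as follows. The numeric costs, the obstacle names, and the representation of a plan as a list of (operator, obstacles) pairs are assumptions made for illustration only.

```python
def estimated_cost(plan, action_cost, obstacle_bound):
    """Sum of the cost of each action in the plan plus a lower-bound
    estimate for removing each obstacle that blocks it."""
    total = 0.0
    for operator, obstacles in plan:
        total += action_cost[operator]
        total += sum(obstacle_bound[o] for o in obstacles)
    return total


# Hypothetical per-action costs and per-obstacle-class lower bounds.
ACTION_COST = {"walk": 1.0}
OBSTACLE_BOUND = {"chair": 1.0, "closed_door": 1.0, "locked_door": 3.0}

# Each plan is a list of (operator, obstacles blocking that operator).
plan_a = [("walk", []), ("walk", ["locked_door"]), ("walk", [])]
plan_b = [("walk", ["chair"]), ("walk", ["closed_door"]),
          ("walk", ["chair"]), ("walk", [])]

cost_a = estimated_cost(plan_a, ACTION_COST, OBSTACLE_BOUND)
cost_b = estimated_cost(plan_b, ACTION_COST, OBSTACLE_BOUND)
ranked = sorted([plan_a, plan_b],
                key=lambda p: estimated_cost(p, ACTION_COST, OBSTACLE_BOUND))
```

Under these assumed numbers the shorter plan through the locked door is examined first, even though its single obstacle is the most expensive to clear.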

As shown in FIG. 1, from step S120 and below, the planner may iteratively work on the plan with the lowest estimated cost. As first supplied by the method shown in FIG. 2, candidate plans consist of a sequence of subgoals, for example of applying certain actions, but the actions may not be feasible due to obstacles. The planner may automatically refine such a plan by going through it in time order, and expanding the first unresolved element as indicated in step S130. At step S140 a determination is made as to whether such an expansion results in a finished plan that solves the problem. If yes, this is output as indicated in step S150. An example of a finished plan would be a concrete series of actions that in simulation take the domain from the initial state to a state achieving the design goals. Otherwise, if the determination at step S140 is negative, processing may proceed to step S160 to determine if candidate plans remain so that they may be further considered. If no candidate plans remain the planner may report failure as indicated in step S170.

The expansion of an unresolved subgoal (indicated at step S130 in FIG. 1) is further detailed in FIG. 3. Initially, candidate plans may consist of a sequence of subgoals. As these are processed, the candidate plan may consist of an initial sequence of actions that have been found to be applicable in simulation, followed by a remaining sequence of subgoals. A subgoal that has not been resolved into one or more concrete actions achieving it may be considered unresolved. For example, a subgoal of performing an operator which is blocked by obstacles may be considered unresolved, but as the plan is refined, further operators may be added that clear the obstacle, and when a plan is generated in which the operator is no longer blocked by obstacles and may be performed, the subgoal may be considered solved or resolved.

At step S310, a determination is made as to whether the next subgoal is already solved. If the determination is yes, it may be deleted from the plan as indicated in step S320, and the plan's estimated cost may be updated to reflect this as indicated in step S330. For example, if the estimated cost for the plan incorrectly includes an additional cost for solving this subgoal, it may be subtracted out. If the determination at step S310 is negative, that is, if the subgoal is not yet solved, processing may proceed to step S340 whereat the planner may ascertain if a method is specified for achieving the subgoal. For example, if the subgoal is to clear a particular obstacle for a particular kind of action, the action class or the obstacle class may specify a method to clear such obstacles. If a method is specified, processing may proceed to step S350 whereat the planner may automatically apply this method. If not, processing may proceed to step S360, whereat the planner may automatically ascertain whether there is a default method for solving subgoals in this situation or of this type. If there is, processing may proceed to step S370, whereat the default method may be automatically applied. If there is no default method, processing may proceed to step S380, whereat the plan may fail.
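The decision order of FIG. 3 (already solved, then a specified method, then a default method, then failure) can be sketched as a small dispatch function. The `Subgoal` fields, the two lookup tables, and the step names returned by the methods are hypothetical; the specification does not prescribe this representation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Subgoal:
    kind: str            # e.g. "clear_obstacle" (hypothetical label)
    target_class: str    # class of the obstacle or action involved
    solved: bool = False


Method = Callable[[Subgoal], List[str]]


def process_subgoal(subgoal: Subgoal,
                    specific: Dict[str, Method],
                    default: Dict[str, Method]) -> Optional[List[str]]:
    """Return the steps to insert for this subgoal, [] if it is already
    solved, or None if no method applies and the plan fails."""
    if subgoal.solved:                             # cf. S310/S320
        return []
    method = specific.get(subgoal.target_class)    # cf. S340
    if method is not None:
        return method(subgoal)                     # cf. S350
    fallback = default.get(subgoal.kind)           # cf. S360
    if fallback is not None:
        return fallback(subgoal)                   # cf. S370
    return None                                    # cf. S380: plan fails


# Hypothetical method tables for the office example.
SPECIFIC = {"locked_door": lambda sg: ["find_key", "unlock_door", "open_door"]}
DEFAULT = {"clear_obstacle": lambda sg: ["search_for_affecting_operator"]}

door_steps = process_subgoal(Subgoal("clear_obstacle", "locked_door"), SPECIFIC, DEFAULT)
chair_steps = process_subgoal(Subgoal("clear_obstacle", "chair"), SPECIFIC, DEFAULT)
no_method = process_subgoal(Subgoal("teleport", "wall"), SPECIFIC, DEFAULT)
```

Returning `None` for the failure case keeps it distinct from the empty list returned for an already solved subgoal, which requires no further steps.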

Whenever a plan fails, it is removed from consideration, and the planner may automatically back up to consider any remaining candidate plans. If no candidate plans remain, the planner may determine that it is unable to solve the design problem.
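The best-first refinement loop of FIG. 1 can be sketched with a priority queue. This is a simplified illustration under stated assumptions: the `PREREQ` table, the operator names, and the unit cost increment are hypothetical, and the finished-plan check is performed when a plan is taken from the queue rather than immediately after its expansion.

```python
import heapq
from itertools import count

# Hypothetical prerequisites: an operator is blocked until its prerequisite
# appears earlier in the plan.
PREREQ = {"walk_through_door": "open_door", "open_door": "unlock_door"}


def first_unresolved(plan):
    """Index of the first blocked subgoal in time order, or None."""
    done = set()
    for i, step in enumerate(plan):
        need = PREREQ.get(step)
        if need is not None and need not in done:
            return i
        done.add(step)
    return None


def expand(cost, plan):
    """Insert the missing prerequisite directly before the blocked subgoal.
    Returning an empty list would mean the plan fails and is dropped."""
    i = first_unresolved(plan)
    if i is None:
        return []
    return [(cost + 1, plan[:i] + (PREREQ[plan[i]],) + plan[i:])]


def refine(candidates):
    tie = count()  # tie-breaker so plans themselves are never compared
    heap = [(cost, next(tie), plan) for cost, plan in candidates]
    heapq.heapify(heap)
    while heap:                                   # candidate plans remain
        cost, _, plan = heapq.heappop(heap)       # lowest estimated cost first
        if first_unresolved(plan) is None:
            return plan                           # finished plan is output
        for new_cost, new_plan in expand(cost, plan):
            heapq.heappush(heap, (new_cost, next(tie), new_plan))
    return None                                   # report failure


final_plan = refine([(2, ("walk_east", "walk_through_door"))])
```

A failed plan simply contributes no successors, so the loop falls back to whatever candidates remain in the queue and reports failure only when the queue is empty.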

FIG. 4 shows an embodiment of a default method for processing a subgoal when it specifies an action.

First, a determination may be made as to whether the action is impossible in a known unfixable way as indicated at step S410. This may happen in a number of ways. In one way, the action may be recognized as simply impossible. An example might be if the action called for walking through a wall (in a domain in which walls were impermeable and unaffectable). This may be recognized by a method associated with the action, or the obstacle, which recognizes when an action is impossible. In another way, the proposed action may be impossible because it creates a deadlock that could not be avoided by prior moves. This may be recognized by a specific deadlock recognition module. Deadlocks will be further described below. In either of these cases and as indicated at step S420 the plan fails.

If the determination result in step S410 is negative, processing may proceed to step S430 whereat a determination is made as to whether the proposed action requires a prerequisite. If yes, processing may proceed to step S440 whereat the subgoal of achieving the prerequisite(s) is added, and the cost estimate for the plan may be updated to reflect the additional subgoal as indicated at step S450.

Whenever a new subgoal is added to a plan, it may be added as the next subgoal, to be treated ahead of whichever subgoal caused it to be added. So for example, when as above the subgoal of achieving prerequisites for an action is added, because the subgoal of performing the action is being considered, said subgoal of achieving prerequisites may be added directly before said subgoal of performing the action, meaning that the prerequisites will be fulfilled before the action is taken.

If the determination result in step S430 is negative, processing may proceed to step S460 whereat a determination may be made as to whether the proposed action is blocked because of some potentially correctable obstacle. If yes, the planner may invoke a routine for handling marking as indicated by step S470 and may then invoke a clearing routine for the obstacle as indicated by step S480.

If the determination in step S460 is negative, that is, if the action is not blocked by an obstacle, processing may proceed to step S490 whereat a determination may be made as to whether implementing the plan will create a deadlock. If yes, the mark handling and clearing routines may be invoked for the deadlock. If not, processing may proceed to step S4100 whereat a make action routine may be invoked.

FIG. 5 shows an embodiment of a clearing routine (as may be applied in step S480). First as indicated in step S510, a relevant action is chosen. If no specific method is given for finding actions relevant to a particular subgoal (in this case, clearing an obstacle or a deadlock) a default method of defining relevant actions may be that any action affecting the obstacle or deadlock may be considered relevant, if it has not already been tried at this position in processing the plan. Next at step S520 a new plan may be created which may be identical to the current candidate plan, except with the subgoal of performing the chosen relevant action inserted as next subgoal. Next at step S530 a cost estimate may be assigned to this new plan. The cost estimate of the new plan may be the cost estimate of the modified plan plus an estimate of the cost of performing the additional inserted action, unless an estimate of this cost had previously been incorporated.

The clearing routine may be embodied in a non-deterministic way, in which case when invoked it may create a single new candidate plan. In this case, a random relevant previously unapplied action may be chosen in step S510, or the least cost or otherwise most promising remaining relevant action may be chosen. In this case, if it can find no relevant actions in step S510, the clearing routine may simply exit (and the candidate plan that invoked it may fail). Alternatively, the clearing routine may loop (as shown by the dashed line) creating a number of candidate plans, one with each possible relevant action. In this case it will exit when no further relevant actions exist.
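The clearing routine just described may be sketched as follows, for illustration only. The sketch assumes that actions are plain strings, that a caller-supplied predicate affects(action, obstacle) implements the default notion of relevance, and that action_cost(action) supplies a cost estimate; these names, and the Plan record itself, are hypothetical. The all_actions flag selects between the looping embodiment and the single-plan embodiment that picks the least-cost relevant action.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Sequence, Tuple


@dataclass(frozen=True)
class Plan:
    subgoals: Tuple[str, ...]             # next subgoal first
    cost: float                           # current cost estimate
    tried: FrozenSet[str] = frozenset()   # actions already tried at this position


def clearing_routine(plan: Plan,
                     obstacle: str,
                     actions: Sequence[str],
                     affects: Callable[[str, str], bool],
                     action_cost: Callable[[str], float],
                     all_actions: bool = True) -> List[Plan]:
    """Return new candidate plans; an empty list means the invoking plan fails."""
    # S510: relevant actions affect the obstacle and have not yet been tried here.
    relevant = [a for a in actions if affects(a, obstacle) and a not in plan.tried]
    if not all_actions:
        # Single-plan embodiment: keep only the least-cost relevant action.
        relevant = sorted(relevant, key=action_cost)[:1]
    new_plans = []
    for action in relevant:
        new_plans.append(Plan(
            subgoals=(action,) + plan.subgoals,      # S520: insert as next subgoal
            cost=plan.cost + action_cost(action),    # S530: update the estimate
        ))
    return new_plans
```

In the office example, calling the routine for a blocked chair would yield one candidate plan per action that moves that chair, each with the move placed ahead of the blocked step.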

Note that the planner may consider all relevant candidate plans (but avoid irrelevant candidate plans), as follows. Say, for example, there is a subgoal of performing an action in a candidate plan that is being processed, and this action is blocked by an obstacle. A clearing routine may be invoked for this obstacle, which may then create a candidate plan with the subgoal of performing an action on the obstacle (to clear it out of the way). Say this action is blocked by another obstacle. Then, when said candidate plan with this subgoal inserted is processed, a clearing routine may be invoked for this second obstacle, which may create a candidate plan that first attempts to move it. Candidate plans may in this way be created so long as the actions added are relevant because of some causal chain or link.

The first subgoal that is not yet resolved may be processed by creating candidate plans with an additional subgoal or subgoals inserted that propose actions relevant to accomplishing the unresolved subgoal. A next action may be relevant if it affects an obstacle preventing the next action in the plan. An obstacle might prevent such an action by simply being in the way (like a door in the example of FIG. 10). Another way an obstacle might be relevant to a next proposed action is by being part of a deadlock created when the action is taken in the current position, as will be discussed below.

The planner may iteratively work on the lowest cost candidate plan, and on the next element of that plan in time sequence until either it finds a plan that achieves the goals, or it runs out of proposed plans. Because it may iteratively work on the lowest estimated cost plan, when it finds a plan it may find a low cost one. Because it may search only actions judged relevant, it may find a plan efficiently and rapidly. Because it may work in time-ordered fashion on the plans, and maintain a simulation of the position the plan has reached to that time, it may judge which actions are possible at any given point, enabling it to avoid considering positions that it does not know how to reach, or actions at such positions that may be impossible.
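The iteration over the lowest estimated cost candidate plan may be sketched, under stated assumptions, with a priority queue. The function best_first_plan and its callbacks is_complete and refine are hypothetical names; refine stands in for the subgoal processing described above and returns an empty list when a plan fails. The toy domain (reaching position 3 with steps of length 1 or 2) is illustrative only and uses the cost so far as the estimate.

```python
import heapq
from itertools import count


def best_first_plan(initial_plans, is_complete, refine):
    """Repeatedly refine the lowest estimated cost candidate plan.

    initial_plans: iterable of (estimated_cost, plan) pairs.
    is_complete(plan): True when the plan achieves the goals.
    refine(plan): list of (estimated_cost, plan) successors, empty if the plan fails.
    Returns (cost, plan), or None when no candidate plans remain.
    """
    tie = count()  # tie-breaker so that plans themselves are never compared
    heap = [(c, next(tie), p) for c, p in initial_plans]
    heapq.heapify(heap)
    while heap:
        cost, _, plan = heapq.heappop(heap)
        if is_complete(plan):
            return cost, plan
        for c, p in refine(plan):
            heapq.heappush(heap, (c, next(tie), p))
    return None


# Toy domain (an assumption for illustration): a plan is a tuple of step lengths.
STEP_COST = {1: 1.0, 2: 1.5}
GOAL = 3


def toy_refine(plan):
    position = sum(plan)
    so_far = sum(STEP_COST[s] for s in plan)
    return [(so_far + STEP_COST[s], plan + (s,))
            for s in STEP_COST if position + s <= GOAL]


result = best_first_plan([(0.0, ())], lambda p: sum(p) == GOAL, toy_refine)
```

Because the cheapest candidate is always expanded first, the first complete plan popped from the queue in this toy domain is also a cheapest one.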

In the example of FIG. 10, the planner might first process a path through the locked door 8. A method specific to locked doors may suggest looking in the drawers of a proximate desk, such as desk 9, for a key. If no key is found in the desk, this candidate plan may fail. The planner may then process an alternate path through the closed door 3. If it has no specific method for handling chairs, it may invoke a default method, and thus try actions that affect chair (2), such as moving it east. It may then be able to open the closed door. It may then try actions affecting chair (5), such as moving it north. It would find this action blocked by chair (4). It might then amend the plan by first trying an action moving chair (4) west, which would enable it to move chair (5) north. It might alternatively create a plan moving chair (5) south instead of north. Whichever of these plans was considered less costly (depending on details of the embodiment, it would in this case most likely be the latter plan, since it involves fewer obstacles) may be worked on first. By working on the least estimated cost plan first, the planner may avoid considering many potential plans.

FIG. 6 shows an embodiment of the mark handling routine invoked in step S470 of FIG. 4. Marking and mark handling is an optional improvement on the planner and may be omitted. If marking is employed, whenever an action is actually made in simulation in a candidate plan (for example by the make action routine invoked in step S4100) the effects of that action may be marked and a copy of the candidate plan as it was just before said action was made may be saved (as is indicated in step S820, discussed below). For example, an object that is moved by the action may be marked. Then, if marked objects or effects impede a later subgoal, a copy of the saved plan may be retrieved and used to see if it would be possible for a plan to avoid this problem by performing the actions in a different order.

Thus, when an obstacle or deadlock is encountered, a determination may be made as to whether it is marked as indicated at step S610. If it is, the stored copy of the plan that marked the problem may be retrieved as indicated at step S620. A copy of this stored copy may be modified as indicated at step S630 by inserting as next subgoal the subgoal of making the action encountering the problem. Thus this subgoal is inserted before the subgoal of performing the action that marked the obstacle. The cost for this new plan may be updated as indicated by step S640, which may be the cost estimate of the stored plan plus a cost estimate for the newly added subgoal, and this plan may be added to the set of candidate plans as indicated by step S650.

If the problem was marked by a plurality of previous plans, a new plan may be created and inserted in the plan set by modifying each such stored plan in the same way.

FIG. 8 describes an embodiment of the make action routine, invoked in step S4100, the routine employed when a specified action has been found to be makeable without further prerequisite, obstacle removal, or deadlock prevention. The first step, as indicated at step S810, may be to store in memory the current simulated domain state and plan, so that it will be possible to back up later to the point before the action was made. (This step S810 may be omitted if marking is not employed.)

The action may then be made in simulation as indicated at step S820 and effects of the action may be marked by invoking a method associated with the action. The action may also be appended to the list of actions performed by the plan (so that the list can be output in step S170 when a successful plan is generated). The cost estimate for the plan may then be updated, which may be the actual cost of achieving the present point plus an estimate of the cost of achieving any remaining subgoals as indicated at step S830. The planner may then check or determine at step S840 as to whether the world situation is now in an identical state in all relevant ways to a state achieved by a previously considered plan.

If so, a determination may be made at step S860 as to whether the cost of getting here was more by this plan. If such determination is yes, this plan may fail as indicated at step S870. If the determination at step S860 is negative, that is, if the cost of getting here was less by this plan, the cost associated with the stored state may be updated to the new lower cost of reaching it as indicated at step S880. If the determination at step S840 is negative, that is, the situation is not identical in all relevant ways to a previously reached state, the world situation may be stored in memory with the cost of achieving this state attached as indicated at step S850 so that the planner can check later plans for duplication of state. Storage and checking may be accomplished efficiently using the method of a hash table. Finally at step S890 the planner may check to see if any new higher level candidate plans are now possible from the reached state. If so, it may add these new candidate plans as indicated at step S8100, with a cost estimate which may be the actual cost of reaching this point plus the estimated cost of remaining subgoals along the new plan. These new plans may be found using, for example, the same approach that may have been used to find the original set of candidate plans, such as that shown in FIG. 2 (but invoked from the current state). These new candidate plans may begin with the initial list of actions that has reached the current simulated domain state, but may have new additional subgoals as they are worked out to reach the final goal.
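The duplicate-state check of steps S840 through S880 may be sketched with a hash table as follows. This is a minimal sketch assuming that the relevant aspects of a world situation can be encoded as a hashable value (here a frozenset of object and position pairs); the function name record_state and this encoding are hypothetical.

```python
from typing import Dict, Hashable


def record_state(seen: Dict[Hashable, float], state: Hashable, cost: float) -> bool:
    """Return False when the current plan should fail, True otherwise."""
    previous = seen.get(state)   # S840: was an identical state reached before?
    if previous is None:
        seen[state] = cost       # S850: store the state with its cost
        return True
    if cost > previous:          # S860: did this plan pay more to get here?
        return False             # S870: this plan fails
    seen[state] = cost           # S880: keep the new, lower cost
    return True


# Usage with a hypothetical encoding of the office domain.
seen: Dict[Hashable, float] = {}
state = frozenset({("man", (1, 1)), ("chair4", (2, 1))})
first = record_state(seen, state, 5.0)    # new state, stored
second = record_state(seen, state, 7.0)   # same state reached at higher cost
third = record_state(seen, state, 3.0)    # same state reached more cheaply
```

A Python dict is itself a hash table, so lookup and update each take expected constant time in the number of stored states.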

The planner may be supplied with a deadlock detector that detects local configurations that prevent any possible sequence of actions from achieving the goals. Such configurations may, for example, consist of collections of affectable objects that obstruct each other in such a way that they cannot be moved. When the deadlock detector detects a deadlock after a simulated action, it returns the set of affectable objects participating in the deadlock. Plans may then be added that have as next subgoal (ahead of performing said simulated action) the subgoal of moving these obstacles so that said deadlock will not be encountered when said action is made.

FIG. 9 is a flowchart of an embodiment of a deadlock detector. The deadlock detector may comprise a number of subroutines or methods capable of recognizing deadlocks. For example, such a method may recognize a particular configuration of objects that implies a deadlock, a state from which goals cannot be solved. Such a method may, for example, simply recognize a particular pattern of objects that is known to cause a deadlock. Alternatively, it may do a detailed calculation, sometimes invoking a planning system, that determines a deadlock is present. The deadlock detector may scan the domain simulation in turn looking for each type of deadlock.

In the embodiment of FIG. 9, one loops (from step S920 and below) over one's collection of known deadlock patterns. The scan may be initiated with the first pattern as indicated at step S910. For each pattern a determination may be made at step S930 as to whether the pattern is found in the domain simulation. If so and as indicated at step S940, the elements which comprise the deadlock may be reported and the detector may exit. If not, processing may proceed to step S950 whereat a determination may be made as to whether there is a remaining pattern in one's collection. If yes, processing may be updated to the next pattern as indicated at step S970 and processing may be returned to step S920 to scan with said next pattern; otherwise, when no patterns remain, the detector may report that no pattern was found as indicated at step S960 and exit.
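The pattern scan of FIG. 9 may be sketched as follows. The sketch assumes that each deadlock pattern is a callable returning the set of participating elements (empty when the pattern is absent); the grid encoding of the state and the corner_pattern example, which resembles a well-known Sokoban corner deadlock, are illustrative assumptions and not patterns taken from the disclosure.

```python
from typing import Callable, Dict, Optional, Sequence, Set, Tuple

Cell = Tuple[int, int]
State = Dict[str, Set[Cell]]          # hypothetical keys: "walls", "boxes", "goals"
Pattern = Callable[[State], Set[Cell]]


def corner_pattern(state: State) -> Set[Cell]:
    """Boxes off-goal that touch a wall both vertically and horizontally."""
    walls, stuck = state["walls"], set()
    for (r, c) in state["boxes"]:
        if (r, c) in state["goals"]:
            continue
        vertical = (r - 1, c) in walls or (r + 1, c) in walls
        horizontal = (r, c - 1) in walls or (r, c + 1) in walls
        if vertical and horizontal:
            stuck.add((r, c))
    return stuck


def detect_deadlock(state: State, patterns: Sequence[Pattern]) -> Optional[Set[Cell]]:
    """Scan the known patterns in turn; report the first deadlock found."""
    for pattern in patterns:          # S910, S950, S970: iterate over patterns
        elements = pattern(state)     # S930: is the pattern present?
        if elements:
            return elements           # S940: report the deadlocked elements
    return None                       # S960: no pattern found


cornered = {"walls": {(0, 0), (0, 1), (1, 0)}, "boxes": {(1, 1)}, "goals": set()}
free = {"walls": {(0, 0), (0, 1), (1, 0)}, "boxes": {(2, 2)}, "goals": set()}
```

The returned set corresponds to the affectable objects participating in the deadlock, which the planner may then try to move ahead of the offending action.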

Such subroutines or patterns may be executable code supplied that is appropriate to the domain. Alternatively, the deadlock detector may be constructed using an appropriate module constructor.

U.S. patent application Ser. No. 11/285,937, METHOD AND SYSTEM FOR CONSTRUCTING COGNITIVE PROGRAMS, which is incorporated herein by reference, describes a construction of programs using a component called a module constructor that may take as inputs a collection of examples and an objective function, or some other means of supplying a fitness function, and/or a set of instructions, and return a program that approximately optimizes the objective function run on the examples, or finds a sufficiently fit function, or else reports that it failed. As detailed therein, module constructors may be readily embodied using techniques or variants of techniques such as genetic programming, although other techniques may offer advantages in certain cases.

FIG. 11 shows an embodiment of a module constructor.

First at step S1110 a population of programs may be initiated in a randomized way. In one embodiment, a randomized population of programs may be initiated by repeating n times the randomized construction of a program, for an appropriate population size n. One way each randomized construction of a program may be accomplished is as follows. A first instruction may be chosen randomly from the instruction set. If this instruction has no arguments, the construction is done. Otherwise, instructions may be chosen from the instruction set for each of its arguments. These choices are again at random from the instruction set, except that if the instructions are typed, instructions are chosen randomly from among the instructions of appropriate type. This process may be iterated until no instruction in the program has unfilled arguments, and at each step the probability of choosing those instructions in the instruction set that do not have arguments (sometimes known as atoms) is increased, so that the process terminates with programs that are on average of a size deemed appropriate.

Alternatively, the randomized creation of each program in the population may be accomplished by repeating a number of times the random selection of an instruction and stringing the instructions into a list or an appropriate data structure for the particular programming language. In an alternative embodiment, the programmers may enter one or more programs in the population, and the remaining programs in the population may be created as described above.

Proceeding with the discussion of FIG. 11 at step S1120, each program in the population may be run on each of the examples. If a program fails to terminate within a given time bound on any example, it may be deemed to have failed on that example and given a score of 0 for the example. Next at step S1130, each program's performance may be scored on each example according to the objective function and an overall score for each program on the examples may be accumulated. The programs may then be sorted by score at step S1140. In step S1150 a determination may be made as to whether the highest scoring program scores high enough to achieve the satisfaction criteria. If yes, it may be returned as indicated at step S1160 and the module constructor terminates. If the highest scoring program does not satisfy the criteria, processing may proceed to step S1170 whereat a determination may be made as to whether the total time used has exceeded a timeout criterion. If yes, the module constructor may terminate, returning failure as indicated at step S1180. Otherwise, a portion (such as half) of the population of programs scoring lowest may be deleted as indicated at step S1190. Remaining high-scoring programs may be duplicated, and one copy of each duplicate may be mutated randomly in step S11100. One way of mutating a program is to choose at random an instruction from the program, replace it with another randomly chosen instruction (of appropriate type if the language is typed), and grow the program down from there. Alternatively (or in addition) new programs may be formed by applying the crossover operation of genetic programming to two or more programs in the population, and such new programs added to the population. Execution then returns to step S1120.
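The loop of FIG. 11 may be sketched in miniature as follows. This is an illustrative sketch under several assumptions and not the module constructor of the incorporated application: programs are nested tuples over a tiny untyped instruction set (add, mul, x, one) invented for the example, the objective function is the negative total absolute error on the examples, a generation limit stands in for the timeout criterion, mutation replaces a randomly chosen subtree, and crossover and per-example time bounds are omitted because these toy programs always terminate.

```python
import random

# Hypothetical instruction set: name -> (arity, function); atoms have arity 0.
INSTRUCTIONS = {
    "add": (2, lambda a, b: a + b),
    "mul": (2, lambda a, b: a * b),
    "x": (0, None),     # atom: the input value
    "one": (0, None),   # atom: the constant 1
}
ATOMS = [n for n, (arity, _) in INSTRUCTIONS.items() if arity == 0]
OPERATORS = [n for n in INSTRUCTIONS if n not in ATOMS]


def random_program(rng, depth=0, max_depth=4):
    # The chance of picking an atom grows with depth so that construction terminates.
    if depth >= max_depth or rng.random() < 0.3 + 0.2 * depth:
        return (rng.choice(ATOMS),)
    name = rng.choice(OPERATORS)
    arity = INSTRUCTIONS[name][0]
    return (name,) + tuple(random_program(rng, depth + 1, max_depth) for _ in range(arity))


def evaluate(program, x):
    name = program[0]
    if name == "x":
        return x
    if name == "one":
        return 1
    function = INSTRUCTIONS[name][1]
    return function(*(evaluate(arg, x) for arg in program[1:]))


def mutate(program, rng):
    # Replace a randomly chosen subtree and grow the program down from there.
    if len(program) == 1 or rng.random() < 0.3:
        return random_program(rng, depth=1)
    i = rng.randrange(1, len(program))
    return program[:i] + (mutate(program[i], rng),) + program[i + 1:]


def score(program, examples):
    # Assumed objective: negative total absolute error (0 is a perfect fit).
    return -sum(abs(evaluate(program, x) - y) for x, y in examples)


def module_constructor(examples, good_enough=0, pop_size=40, max_generations=50, seed=0):
    rng = random.Random(seed)
    population = [random_program(rng) for _ in range(pop_size)]          # S1110
    for _ in range(max_generations):                                     # stands in for S1170
        ranked = sorted(population, key=lambda p: score(p, examples),
                        reverse=True)                                    # S1120 to S1140
        if score(ranked[0], examples) >= good_enough:                    # S1150
            return ranked[0]                                             # S1160
        survivors = ranked[: pop_size // 2]                              # S1190
        population = survivors + [mutate(p, rng) for p in survivors]     # S11100
    return None                                                          # S1180
```

For example, module_constructor([(x, x * x + 1) for x in range(-3, 4)]) either returns a program that fits every example exactly or returns None once the generation limit is reached.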

An alternate embodiment of a module constructor which may be particularly appropriate for constructing deadlock recognition subroutines takes a known set of deadlock positions (for example, supplied by the user) and backs up to find other deadlock positions. This may be appropriate if the action-operators can be simulated in reverse, as is often the case. If one backs up from a deadlock position to find a state such that some action-operator takes that state to the deadlock position, then that state is a candidate deadlock position. It may be promoted to a known deadlock position if all action-operators take it to known deadlocks. The following steps may then be iterated:

-   start with a collection of known deadlocks,
-   create the set of backup states from the deadlocks,
-   test the backup states to see which are deadlocks, and
-   update the set of known deadlocks.

This may be iterated as many times as convenient to find a collection of deadlock states. These deadlocks may then be embodied in patterns and used to scan for deadlocks within a domain.
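The backing-up iteration listed above may be sketched as follows, for illustration only. The sketch assumes a small explicit state graph given as a dictionary mapping each state to the states its action-operators lead to, and obtains reverse simulation by inverting that dictionary; the function name expand_deadlocks and the toy graph are hypothetical.

```python
from typing import Dict, Hashable, Iterable, List, Set


def expand_deadlocks(known: Iterable[Hashable],
                     successors: Dict[Hashable, List[Hashable]],
                     rounds: int = 10) -> Set[Hashable]:
    """Grow a set of known deadlock states by backing up from them."""
    # Simulate the operators in reverse by inverting the successor relation.
    predecessors: Dict[Hashable, Set[Hashable]] = {}
    for state, nexts in successors.items():
        for nxt in nexts:
            predecessors.setdefault(nxt, set()).add(state)

    deadlocks = set(known)                       # start with the known deadlocks
    for _ in range(rounds):
        # Create the set of backup states (candidate deadlocks).
        candidates = {p for d in deadlocks for p in predecessors.get(d, ())} - deadlocks
        # A candidate is promoted if every operator takes it to a known deadlock.
        promoted = {c for c in candidates
                    if all(n in deadlocks for n in successors[c])}
        if not promoted:
            break
        deadlocks |= promoted                    # update the set of known deadlocks
    return deadlocks


# Toy graph: "d" is a known deadlock; "b" can only move into "d".
toy = {"a": ["b", "c"], "b": ["d"], "c": ["d", "goal"], "d": ["d"], "goal": []}
found = expand_deadlocks({"d"}, toy)
```

Here state "c" is not promoted because one of its operators still reaches the goal, and "a" is not promoted because it can reach "c".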

For problems where multiple goals must be satisfied simultaneously, computation of a planner may be greatly sped up if it is possible to first assess the order in which the goals should be addressed. It may be, for example, that solving goal A will of necessity destroy a solution of goal B, in which case one should first attempt to achieve goal A, and then to achieve goal B. FIG. 12 illustrates a flowchart for a method to find constraints on the order in which goals are solved when there are multiple goals. The planner may cycle through pairs of goals A (step S1210) and B (step S1220), and for each pair try to solve A given the constraint that B remains solved (step S1230). One may branch on whether this can be done or not (step S1240). If so, then the constraint that A must be solved before B is solved (for the last time) may be added to a list of such constraints (step S1250). Otherwise, at step S1260 a determination may be made as to whether any pairs were not considered. If yes, processing may loop back to consider remaining pairs of goals, and when done, output a list of constraints (step S1270). Such constraints may then be employed by only considering candidate plans to solve the goals in the constrained order.

Other domain specific methods may be supplied for analyzing goal ordering, and used to generate constraints on plans considered.

FIG. 7 shows a flowchart of how any of the methods disclosed herein, or modules embodying these methods, may be used in an automatic or evolutionary programming algorithm or within module constructors, such as the procedures discussed in U.S. patent application Ser. No. 11/285,937 “Method and System for Constructing Cognitive Programs”, incorporated herein by reference. As remarked therein and above with reference to FIG. 11, module constructors (or automatic programming algorithms) may be given a set of instructions and then may construct modules or programs out of the instructions to solve problems or achieve design goals. The planner and/or the methods discussed in this application and/or modules built on top of them may be used within module constructors. This may be done in the following manner. First, a planner may be constructed that solves a class of design goals in a domain using the above mentioned or described methods. This is then a program that may solve a class of problems within a domain. In step S710, this program may be input into a module constructor. The module constructor may then construct at step S720 a program to solve another design problem using this planner as an instruction within the instruction set out of which it constructs the program. Finally at step S730 the constructed program is output.

For example, consider the office domain previously discussed. A program may be written named move(x,y) that uses the methods disclosed herein to calculate how to get from any point x to any point y and that may have a side effect of moving a simulated man to that point. Now move(x,y) may be supplied as an instruction to a module constructor that then constructs a program to solve another class of design goals or to maximize a supplied objective function. For example, if the module constructor diagrammed in FIG. 11 were used, the instruction move(.,.) may be utilized as one of the instructions out of which initial programs are constructed in step S1110 and as an instruction used to replace other instructions in mutation operations in step S11100. In an office domain example, the module constructor may be used to find a program that can arrange furniture in an office in such a way that workers may efficiently move around to rapidly perform some class of supplied tasks. The module constructor may use move(x,y) as well as other instructions and automatically build or evolve a program to solve such problems. The final output program may then be efficiently able, given a simulation of an office, to arrange the furniture in a desirable fashion. Given any office architecture (embodied in a simulation program) it simply uses the architecture within the planner, and uses the planner (as well as the simulation program) within the module it has constructed.

FIG. 13 illustrates a system 2000 which may be usable with the above-described programs, planners, modules, scaffolds, instructions, libraries, module constructors, CAD tools, methods, classes and/or other tools. Such system may include an input device 2002, a computer 2004, a memory 2006, and a display unit 2011 which may be coupled together as shown in FIG. 13.

The input device 2002 may enable a user or operator to enter data into the computer 2004. Such input device may be a computer keyboard, a mouse, a writing tablet, or other types of data entry devices. Such user input data may be a number of examples, a number of functions, a number of instructions, a number of satisfaction criteria, and/or a number of simulation environments. The display unit 2011 may enable data to be displayed to the operator.

The computer 2004 may include a memory 2007 and a processor 2009. The memory 2007 may have stored therein programs for use with the present invention. For example, this memory may contain a number of modules, methods, scaffolds, classes, instructions, subprograms, libraries, and so forth which may be used to create the desired program in a manner as previously described. The processor 2009 may be operable to perform and/or control operations used in creating the desired program. Such operations may include receiving and processing user data supplied from the input device 2002, obtaining a number of subprograms in accordance with the received user data, creating the desired program based on the obtained subprogram or subprograms, and/or running the created program to solve the problem. These operations may also include enabling the problem to be divided into a plurality of subproblems. The subprograms may be obtained from programs previously stored in memory or, alternatively, may be obtained from running a stored subprogram or subprograms utilizing the user input data.

The computer 2004 may be operable to receive a portable type memory 2006 such as a disc, semiconductor memory, or the like. Such memory 2006 may be operable to have all or any part of the above-described programs, subprograms, modules, methods, classes and/or scaffolds stored therein.

Furthermore, the computer 2004 may be coupled to a network 2030 by way of a connection such as a bus 2012 or, alternatively, by wireless means. Additionally, such network may be the Internet and may include a number of other computers such as computers 2008, 2010 and so forth. As a result of such arrangement, the computer 2004 may be able to communicate with a number of other computers during its operations and/or may be able to use information from such other computers.

As an example of the operation of the present invention, reference is made to FIGS. 10 and 13. A user may supply information pertaining to a domain by use of input 2002, or a portable memory device 2006. Such information may be stored in memory 2007 and supplied to the processor 2009, or by way of bus 2012 (or wireless communication) to a network 2030 to the processor or processors in the other computers 2008 and 2010. Such information may include simulation programs or simulation modules, item(s), classes of items or objects, classes of operators, classes of domains, methods associated with operators or classes of operators, methods associated with objects or classes of objects, methods associated with a particular domain or a class of domains, methods for clearing particular obstacles or particular classes of obstacles to said operators or classes of operators, methods associated with domains or classes of domains for simulating the effects of operators or classes of operators, methods for recognizing when operators are enabled, methods for simulating the effects of operators, methods for simulating the effects that operators would have if obstacles were removed, methods for specifying states wherein operators might be applicable if affectable obstacles were removed, methods for recognizing when obstacles or objects may be affected by operators and methods for recognizing when obstacles or objects are not affectable by any available operators, methods for recognizing deadlocks, classes of deadlocks, deadlock patterns, classes of goals and methods associated with goal classes, wherein all such methods, classes, modules or programs may be embodied in executable code. With reference to the example of FIG. 10, as previously discussed, supplied operators may include operators stepping a simulated man one meter north, south, east, or west, and methods that specify where said simulated man would be considered to arrive (even if an obstacle such as a closed door required correction before the step would be possible), methods for identifying chairs as moveable objects and for simulating operators moving chairs one meter north, south, east, or west and a method associated with the class of locked doors that searches through drawers on nearby desks for a key, and then tries to unlock the door with keys that are found, and if successful opens the door. Such classes and methods, all of which may be embodied in executable code, may be input into the computer's memory by a user.

The user or other users may subsequently input an initial state for the domain, and a design goal or goals, and the present invention may then by use of the computer 2004, including its processor and memory, automatically return a plan for achieving the design goal or goals without any additional inputs from the user or users. For example, once appropriate methods and modules have been input for office navigation problems, the user may input a map of an office, an initial position, and a goal position, and the program may automatically provide a plan for getting from the initial position to the goal position without any additional input or inputs from the user or users or may report that no such plan is feasible.

The following references and all the references referenced therein are herein incorporated by reference:

Baum, E. B. (2005) Methods and Apparatus for Planning and For Use of Planning in Cognitive Programs, U.S. Appln. No. 60/671,660; Baum, E. B. (2005) U.S. patent application Ser. No. 11/285,937, METHOD AND SYSTEM FOR CONSTRUCTING COGNITIVE PROGRAMS; Baum, E. B. (2004) “What is Thought?” MIT Press, Cambridge Mass.; Baum, E. B., Durdanovic, I. (2000) “An Artificial Economy of Post Production Systems,” in Advances in Learning Classifier Systems: Third International Workshop, IWLCS 2000, ed. P. L. Lanzi, W. Stoltzmann, and S. M. Wilson, 3-21, Berlin: Springer-Verlag; Baum, E. B., Durdanovic, I. (2000) “Evolution of Cooperative Problem Solving in an Artificial Economy,” Neural Computation 12 (12): 2743-2775;

Andreas Junghanns and Jonathan Schaeffer, Sokoban: Enhancing General Single-Agent Search Methods Using Domain Knowledge, Artificial Intelligence, vol. 129, no. 1-2, pp. 219-251, 2001 and Stuart Russell and Peter Norvig, Artificial Intelligence, a Modern Approach, Prentice Hall, Englewood Cliffs N.J. (1995).

Although the invention herein has been described with reference to particular embodiments, it is to be understood that these embodiments are merely illustrative of the principles and applications of the present invention. It is therefore to be understood that numerous modifications may be made to the illustrative embodiments and that other arrangements may be devised without departing from the spirit and scope of the present invention as defined by the appended claims.

What is claimed is:
1. A system for achieving a desired goal in a domain, in which the domain has one or more operators associated therewith, said system comprising: means for receiving information pertaining to the domain and for simulating the domain therefrom; means for simulating one or more effects due to the one or more operators associated with the domain; means for specifying a number of items and/or a number of classes of items in the domain and whether each item and/or each class of items is an affectable obstacle wherein at least one of the one or more operators can cause a change thereto or a non-affectable obstacle wherein the one or more operators cannot cause a change thereto; means for automatically generating a candidate plan to achieve the desired goal by utilizing the simulated domain and the simulated effect(s), wherein the candidate plan could involve one or more affectable obstacles but does not involve any non-affectable obstacles; and means for automatically refining the candidate plan to change at least one of the affectable obstacles involved in the candidate plan.